fix(chrome-ai): probe-gate caps + session/validation correctness (#514) #520

Merged
sroussey merged 6 commits into chrome-ai from claude/libs-514-fixes-Bi1rh
May 20, 2026

Conversation

@sroussey
Collaborator

Stacks fixes on top of #514's chrome-ai branch. Five focused commits — one per issue.

Summary

C1 — Probe-gate tool-use and json-mode

inferWebBrowserCapabilities unconditionally advertised json-mode + tool-use for chrome-prompt/gemini-nano, but LanguageModel.create's tools and LanguageModel.prompt's responseConstraint aren't universally accepted across Chrome builds. The dispatcher could route a json-mode/tool-use task to a provider that would reject it at runtime.

  • New providers/chrome-ai/src/ai/common/WebBrowser_CapabilityProbe.ts: one-shot probe with module-level promise coalescing; smoke-tests both options independently and immediately destroys the test sessions.
  • WebBrowserProvider constructor kicks off the probe and stores result on this.probedCaps. Provider exposes ready(): Promise<void>.
  • Pre-probe: conservative subset (no json-mode, no tool-use). Post-probe: reflects browser surface.
  • inferWebBrowserCapabilities(model, probed?) defaults to {jsonMode: true, toolUse: true} for back-compat; new inferWebBrowserCapabilitiesAsync drives the probe.
  • Files: providers/chrome-ai/src/ai/common/WebBrowser_Capabilities.ts, providers/chrome-ai/src/ai/WebBrowserProvider.ts, providers/chrome-ai/src/ai/index.ts.
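
The probe-gating described in C1 can be sketched roughly as follows. This is a minimal illustration, not the code in WebBrowser_CapabilityProbe.ts: the `ProbeFactory`, `ProbeSession`, and `ProbedCapabilities` shapes, the `accepts` helper, the probe payloads, and the `"text-generation"` base capability name are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not the provider's actual types.
interface ProbedCapabilities {
  jsonMode: boolean;
  toolUse: boolean;
}

interface ProbeSession {
  destroy(): void;
}

interface ProbeFactory {
  create(options: Record<string, unknown>): Promise<ProbeSession>;
}

// Module-level promise: concurrent callers share one probe round-trip.
let inFlight: Promise<ProbedCapabilities> | undefined;

// Try one optional create() option and destroy the test session immediately.
async function accepts(
  factory: ProbeFactory,
  options: Record<string, unknown>,
): Promise<boolean> {
  try {
    const session = await factory.create(options);
    session.destroy();
    return true;
  } catch {
    return false; // a rejection is treated as "not supported on this build"
  }
}

function probeCapabilities(factory: ProbeFactory): Promise<ProbedCapabilities> {
  if (!inFlight) {
    // Both options are smoke-tested independently (placeholder payloads).
    inFlight = Promise.all([
      accepts(factory, { responseConstraint: { type: "object" } }),
      accepts(factory, { tools: [] }),
    ]).then(([jsonMode, toolUse]) => ({ jsonMode, toolUse }));
  }
  return inFlight;
}

// With no probe result yet, advertise only the conservative subset.
function inferCapabilities(probed?: ProbedCapabilities): string[] {
  const caps = ["text-generation"]; // placeholder base capability name
  if (probed?.jsonMode) caps.push("json-mode");
  if (probed?.toolUse) caps.push("tool-use");
  return caps;
}
```

Storing the promise rather than the resolved value is what gives the coalescing: a second caller arriving while the probe is still in flight receives the same promise instead of starting a new round-trip.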

H1 — WebBrowser_StructuredGeneration accepts sessionId

The run-fn dropped sessionId from its signature, so successive calls with the same id always rebuilt the underlying LanguageModel.

  • Accepts sessionId as the 6th positional param.
  • Cache reuse keys on a canonical schema fingerprint (recursive key-sorted stringify) stored on ChromeChatSessionState.schemaFingerprint.
  • Schema mismatch forces a rebuild (Chrome's responseConstraint state is bound at first prompt).
  • On stream failure: drop+destroy via the same cacheWritten/dropChromeSessionEntry dance as WebBrowser_Chat.
  • File: providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts (line ~41 signature), providers/chrome-ai/src/ai/common/WebBrowser_Sessions.ts (extended ChromeChatSessionState).
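
A recursive key-sorted stringify of the kind H1 describes might look like the sketch below. The function names and the `CachedEntry` shape are hypothetical, and the sketch assumes the schema is plain JSON data (no functions, no cycles).

```typescript
// Serialise a JSON-like value with object keys in sorted order, so two
// schemas that differ only in key order produce the same fingerprint.
function canonicalStringify(value: unknown): string {
  if (value === undefined) return "null"; // undefined has no JSON form
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalStringify(record[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value); // string, number, boolean, null
}

// Hypothetical cache-entry shape: only the field relevant to this check.
interface CachedEntry {
  schemaFingerprint?: string;
}

// Reuse the cached session only when the schema fingerprint matches.
function canReuseSession(entry: CachedEntry | undefined, schema: unknown): boolean {
  return entry !== undefined && entry.schemaFingerprint === canonicalStringify(schema);
}
```

Array order is preserved on purpose: reordering the elements of an array can change a schema's meaning, while reordering object keys does not.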

H2 — WebBrowser_ToolCalling accepts sessionId

Ignored both outputSchema and sessionId.

  • Accepts both 5th + 6th positional params.
  • Sorted-tool-name fingerprint stored on ChromeChatSessionState.toolsFingerprint. Tool-set change rebuilds.
  • Correctness guard: only cache when input.messages is present. Bare-prompt callers always rebuild because Chrome appends tool-result turns to the session's internal state opaquely — reusing a cache the orchestrator hasn't fully replayed would double-feed results. Documented in code.
  • On error: drop+destroy.
  • File: providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts (line ~104 signature).
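
The caching rules in H2 reduce to a small decision function. The sketch below is illustrative only; `ToolLike`, `ToolCallInput`, `ToolSessionEntry`, and the `CacheDecision` labels are names invented for this example.

```typescript
// Hypothetical minimal shapes for the example.
interface ToolLike {
  name: string;
}

interface ToolCallInput {
  prompt?: string;
  messages?: unknown[];
}

interface ToolSessionEntry {
  toolsFingerprint?: string;
}

type CacheDecision = "reuse" | "rebuild-and-cache" | "rebuild-no-cache";

// Fingerprint the tool set by sorted name, so declaration order is irrelevant.
function toolsFingerprint(tools: ToolLike[]): string {
  return JSON.stringify(tools.map((tool) => tool.name).sort());
}

function decideCache(
  input: ToolCallInput,
  tools: ToolLike[],
  entry: ToolSessionEntry | undefined,
): CacheDecision {
  // Bare-prompt callers never touch the cache: the session's internal
  // tool-result turns cannot be reconciled with what the caller replayed.
  if (input.messages === undefined) return "rebuild-no-cache";
  // Tools are bound when the session is created, so a changed set rebuilds.
  if (entry !== undefined && entry.toolsFingerprint === toolsFingerprint(tools)) {
    return "reuse";
  }
  return "rebuild-and-cache";
}
```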

H3 — Validate tool-call arguments against inputSchema

callInput = (args[0] ?? {}) was forwarded verbatim; filterValidToolCalls only checked the tool name.

  • Compile each tool's inputSchema once via compileSchema from @workglow/util/schema, cached by name.
  • Validate captured args before filterValidToolCalls. Invalid → drop + getLogger().warn(...) matching the existing name-only warning style.
  • Tools whose inputSchema fails to compile log once and fall through to name-only validation (no run-level crash).
  • File: providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts (line ~128 execute stub + new validator pass).
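
The validator pass in H3 could be shaped roughly like this. Because the signature of `compileSchema` is not shown in this PR, the sketch takes the compiler as a `compile` parameter and models a validator as a function returning a list of error strings; both are assumptions, as are the type and function names.

```typescript
// Assumed validator shape: returns error messages, empty when valid.
type Validator = (value: unknown) => string[];

interface ToolDef {
  name: string;
  inputSchema: unknown;
}

interface ToolCall {
  name: string;
  input: unknown;
}

// Compile each tool's schema once, keyed by tool name. A schema that fails
// to compile logs one warning and is left out of the map.
function compileValidators(
  tools: ToolDef[],
  compile: (schema: unknown) => Validator,
  warn: (message: string) => void,
): Map<string, Validator> {
  const validators = new Map<string, Validator>();
  for (const tool of tools) {
    try {
      validators.set(tool.name, compile(tool.inputSchema));
    } catch {
      warn(`inputSchema for "${tool.name}" failed to compile; using name-only validation`);
    }
  }
  return validators;
}

// Drop captured calls whose arguments fail validation. Calls without a
// validator pass through to the later name-only check.
function dropInvalidCalls(
  calls: ToolCall[],
  validators: Map<string, Validator>,
  warn: (message: string) => void,
): ToolCall[] {
  return calls.filter((call) => {
    const validate = validators.get(call.name);
    if (validate === undefined) return true;
    const errors = validate(call.input);
    if (errors.length === 0) return true;
    warn(`dropping call to "${call.name}": ${errors[0]}`);
    return false;
  });
}
```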

H4 — Validate StructuredGeneration final JSON against outputSchema

When JSON.parse failed AND parsePartialJson returned undefined, the run-fn cast {} to the output type, emitted a finish event, and downstream code had no way to distinguish that from a legitimate empty payload.

  • compileSchema at the top of the run; compile failure → PermanentJobError("invalid outputSchema") (avoids burning retry budget on a malformed schema).
  • Unparseable final string → PermanentJobError("Chrome AI returned unparseable JSON"). No finish emitted.
  • Parsed but invalid → PermanentJobError("Chrome AI output failed schema validation: ..."). No finish emitted.
  • Only on parse+validate success is finish emitted and the cache entry written.
  • File: providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts (lines ~94, ~96 of the original; reworked).
  • Retry contract: verified StructuredGenerationTask.executeStream (in packages/ai) wraps super.executeStream(currentInput, context) in a per-attempt for-await and validates per finish, so a thrown error correctly fails the attempt and the loop retries up to maxRetries. Throwing without emitting finish is the right shape.
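
The parse-then-validate ordering in H4 can be sketched as below. `PermanentJobError` here is a local stand-in class, not the library's export, and the `validate` and `parsePartial` parameters are assumed shapes standing in for the compiled validator and `parsePartialJson`.

```typescript
// Local stand-in for the library's PermanentJobError.
class PermanentJobError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "PermanentJobError";
  }
}

// Returns the parsed value only when it both parses and validates.
// The caller emits finish and writes the cache entry only after this returns.
function finalizeStructuredOutput(
  raw: string,
  validate: (value: unknown) => string[],
  parsePartial: (text: string) => unknown,
): unknown {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = parsePartial(raw); // best-effort recovery of truncated JSON
  }
  if (parsed === undefined) {
    throw new PermanentJobError("Chrome AI returned unparseable JSON");
  }
  const errors = validate(parsed);
  if (errors.length > 0) {
    throw new PermanentJobError(
      `Chrome AI output failed schema validation: ${errors[0]}`,
    );
  }
  return parsed;
}
```

Throwing before any finish event is what removes the ambiguity described above: an empty object can no longer reach downstream code unless the schema actually permits it.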

Test plan

  • bun test packages/test/src/test/ai-provider/WebBrowserProvider.test.ts: 46 tests pass (up from 19).
    • 14 inferCapabilities + probe coalescing tests.
    • 4 SG session cache tests + 3 SG validation tests.
    • 3 TC session cache tests + 4 TC argument validation tests.
  • tsgo --noEmit clean on providers/chrome-ai/ and packages/test/.
  • bunx vitest run packages/test/src/test/ai-provider — only failures are unrelated (HFT bbox unit test, llamacpp model download race) and reproduce on the base branch.
  • Manual smoke on a real Chrome build with the Prompt API enabled.
  • Manual smoke on a Chrome build without tools / responseConstraint (probe should gate them out).

Open questions

  • Probe surface: today the probe smoke-tests create({ responseConstraint }) and create({ tools }). Per spec, responseConstraint lives on prompt() options, not create(), so a build that silently accepts unknown create options could give us a false positive. The user brief explicitly asked for the create-time test for both, but the Chromium types show that tools is a create-time option while responseConstraint is per-prompt. For a tighter signal we could additionally run a short promptStreaming with the constraint and read one chunk. Worth a follow-up.
  • H4 retry contract: does StructuredGenerationTask.executeStream catch per-attempt errors from the run-fn? I inspected the task and confirmed it iterates per attempt and validates on finish, so throwing without finish should trigger a retry. I could not run this against a real failing model; please verify with a live Chrome AI smoke test.
  • Fingerprint storage: H1's schema fingerprint and H2's tools fingerprint live on ChromeChatSessionState. They're string-typed and unbounded — for very large schemas the canonical-stringify cost is non-trivial. If we see a hot path, hash to a fixed-length digest.
  • Worker bundle compatibility: @workglow/util/worker deliberately excludes compileSchema (json-schema-library + URI.js + nearley + json-pointer is heavyweight). H3/H4 import from @workglow/util/schema which pulls those in. bun build --packages=external keeps them external so worker startup cost grows only if the consumer actually imports. Worth confirming the worker bundle size delta is acceptable, or guarding the validation behind a no-op fallback when running in the worker entry.
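
On the fingerprint-storage question, one way to bound the stored string is to hash the canonical form to a fixed-length digest. The sketch below uses 32-bit FNV-1a over UTF-16 code units purely as an illustration; the PR does not choose a hash, and `fnv1a32` is a hypothetical helper name.

```typescript
// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
// Non-cryptographic: intended only to shorten a cache key.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  // Convert to unsigned and left-pad to a fixed width.
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}
```

A 32-bit digest can collide, and a collision here would wrongly reuse a session built for a different schema or tool set, so a wider digest may be the safer choice if this path is adopted. Hashing also does not remove the cost of producing the canonical string in the first place; it only bounds what is stored and compared.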

Generated by Claude Code

claude added 5 commits May 20, 2026 08:55
Chrome's `LanguageModel.create` did not universally accept `tools` or
`responseConstraint` options, yet `inferWebBrowserCapabilities` always
advertised `tool-use` + `json-mode` for `chrome-prompt`/`gemini-nano`.
This caused the dispatcher to route json-mode and tool-use tasks to the
WebBrowser provider on Chrome builds that would reject them at runtime.

Adds a one-shot capability probe (`probeWebBrowserCapabilities`) that
smoke-tests `factory.create({ responseConstraint })` and
`factory.create({ tools })`, with module-level coalescing so concurrent
callers share one probe round-trip. `WebBrowserProvider` kicks the probe
off in its constructor; until it resolves, `inferCapabilities` returns
the conservative subset (no `json-mode`, no `tool-use`). Tests cover
all four probe outcome combinations, coalescing, and pre/post-ready
inference.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC
…ingerprint (H1)

The structured-generation run-fn dropped `sessionId` from its signature,
so successive calls with the same id always rebuilt the underlying Chrome
`LanguageModel` even though the surface supports session reuse. This
matched the pre-session-cache behaviour rather than the post-cache shape
adopted by `WebBrowser_Chat`.

Accept `sessionId` as the 6th positional parameter, mirroring chat. Cache
reuse is gated on a canonical schema fingerprint stored on the cache
entry — a schema change forces a rebuild because Chrome's
`responseConstraint` state is bound at first-prompt and re-feeding a
different schema is undefined behaviour. On stream failure the entry is
dropped + destroyed via the same `cacheWritten` / `dropChromeSessionEntry`
dance as chat. `ChromeChatSessionState` grows an optional
`schemaFingerprint` field.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC
… (H2)

`WebBrowser_ToolCalling` ignored both `outputSchema` and `sessionId` —
the 5th and 6th positional parameters of the run-fn contract — so
multi-turn tool-calling rebuilt the `LanguageModel` each turn.

Accept both parameters. Cache reuse keys on a sorted-tool-name
fingerprint (Chrome binds `tools` at `create()` time and can't hot-swap
them per turn). We only cache when the orchestrator drives via
`input.messages` because Chrome's tool-calling loop appends tool-result
turns to the session's internal state opaquely — reusing a cached
session across a turn the orchestrator hasn't fully replayed would
double-feed those results. Bare-prompt callers always rebuild.

On any error we drop + destroy the cache entry: Chrome's internal state
may be mid-tool-call-cycle. `ChromeChatSessionState` grows an optional
`toolsFingerprint` field.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC
… (H3)

Chrome's `LanguageModel` invokes our stub `execute` callback with whatever
arguments the model emits. `filterValidToolCalls` only checked the tool
name, so a hallucinated arg shape was forwarded to the orchestrator
verbatim — leaving the downstream tool runner to either fail or silently
produce garbage.

Compile each tool's `inputSchema` once via `compileSchema` (cached by
name) before the stream starts. After streaming we validate every
captured call's `input` against its tool's validator; failures are
dropped + warn-logged in the same shape as `filterValidToolCalls`'s
existing name-only warning. Tools whose `inputSchema` fails to compile
emit a single warning and fall through to the name-only check rather
than failing the whole run.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC
…ma (H4)

Chrome's `responseConstraint` is best-effort, not a hard guarantee — the
model can still produce a partial or shape-mismatched payload. The
existing fallback (`parsePartialJson(...) ?? {}`) handed downstream code
an empty object cast to the output type, indistinguishable from a
legitimate empty payload. Worse, that path emitted a `finish` event, so
`StructuredGenerationTask`'s retry loop had no signal to retry on.

Compile the validator once via `compileSchema`. After streaming:
 - If neither `JSON.parse` nor `parsePartialJson` produces a value:
   throw `PermanentJobError("Chrome AI returned unparseable JSON")`.
 - If validation fails: throw with the first validator error message.
 - Only on success do we emit `finish` and write the cache entry.

`StructuredGenerationTask.executeStream` catches per-attempt errors and
retries, so throwing here is the correct signal — no `finish` so the
loop knows this attempt failed. Schema compile failures are also
surfaced as `PermanentJobError` (so retries don't burn through quota on
a malformed schema).

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC
@pkg-pr-new

pkg-pr-new Bot commented May 20, 2026

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@520

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@520

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@520

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@520

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@520

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@520

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@520

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@520

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@520

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@520

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@520

@workglow/util

npm i https://pkg.pr.new/@workglow/util@520

workglow

npm i https://pkg.pr.new/workglow@520

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@520

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@520

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@520

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@520

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@520

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@520

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@520

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@520

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@520

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@520

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@520

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@520

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@520

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@520

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@520

commit: b6e3cfe

@github-actions

Coverage Report

Status | Category | Percentage | Covered / Total
--- | --- | --- | ---
🔵 | Lines | 62.17% | 22881 / 36801
🔵 | Statements | 62.04% | 23673 / 38152
🔵 | Functions | 63.14% | 4310 / 6826
🔵 | Branches | 50.74% | 11100 / 21876

File Coverage: No changed files found.
Generated in workflow #2313 for commit b6e3cfe by the Vitest Coverage Report Action

Contributor

Copilot AI left a comment

Pull request overview

This PR hardens the @workglow/chrome-ai provider by (1) probing Chrome Prompt API feature support before advertising json-mode / tool-use, and (2) fixing session reuse + schema validation correctness for Structured Generation and Tool Calling run functions.

Changes:

  • Add a module-level capability probe (coalesced) and wire it into WebBrowserProvider with a ready() hook and conservative pre-probe capability inference.
  • Fix sessionId handling and cache invalidation rules for WebBrowser_StructuredGeneration and WebBrowser_ToolCalling, including schema/toolset fingerprinting.
  • Add schema validation for Tool Calling args (inputSchema) and Structured Generation final JSON (outputSchema), plus expand provider test coverage substantially.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Summary per file:

File | Description
--- | ---
providers/chrome-ai/src/ai/WebBrowserProvider.ts | Kicks off capability probing in the constructor; exposes ready(); gates inferred capabilities using probed results.
providers/chrome-ai/src/ai/index.ts | Extends _testOnly exports for probe helpers and run-fns to support new tests.
providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts | Adds sessionId support, toolset fingerprinting + caching rules, and validates tool-call args against each tool’s inputSchema.
providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts | Adds sessionId support, schema fingerprinting + caching, and validates final JSON against outputSchema with PermanentJobError failures.
providers/chrome-ai/src/ai/common/WebBrowser_Sessions.ts | Extends cached session state to store schema/tool fingerprints alongside the session + message watermark.
providers/chrome-ai/src/ai/common/WebBrowser_CapabilityProbe.ts | New probe module that smoke-tests optional Chrome Prompt API surfaces and caches the result.
providers/chrome-ai/src/ai/common/WebBrowser_Capabilities.ts | Updates capability inference to conditionally include json-mode / tool-use; adds async inference helper.
packages/test/src/test/ai-provider/WebBrowserProvider.test.ts | Adds extensive tests covering probe behavior/coalescing, caching correctness, and schema validation behaviors.

Comment thread providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@sroussey sroussey merged commit 917c34f into chrome-ai May 20, 2026
2 of 3 checks passed
@sroussey sroussey deleted the claude/libs-514-fixes-Bi1rh branch May 20, 2026 15:17
sroussey added a commit that referenced this pull request May 22, 2026
…apability probe

Integrates the chrome-ai branch (7 commits — PR #514/#520/#528) with main's
parallel chrome-ai work (model.download, model.dispose, ApiBinding):

- Chat-session cache keyed by AiChatTask sessionId, with messageCount
  high-water mark for reuse (replaces fingerprint-based invalidation)
- StructuredGeneration + ToolCalling run-fns gated by an async capability
  probe; pre-probe state advertises a conservative subset (no json-mode,
  no tool-use) so the provider never claims a capability it can't fulfil
- ChatHistory helpers + WebBrowser_TextGeneration_Unified dispatcher
  (text.generation shared by AiChatTask + TextGenerationTask)
- ChromeHelpers ships both assertAvailability and ensureAvailable; both
  session APIs (chrome-chat cache + idle-evict store) coexist
- Drops main's WebBrowser_Chat.test.ts (chrome-ai's WebBrowserProvider.test
  already covers chat behavior under the new cache semantics)
sroussey added a commit that referenced this pull request May 22, 2026
…viders

Addresses review of #514/#520/#528 rebase:

CRITICAL fix — `model.dispose` now reaches chat-cached sessions. The
post-rebase chrome-ai branch had two parallel session maps
(`chromeSessions` for chat reuse, `sessions` for idle-evict +
ModelDispose lookup) but only the chat map was populated by runtime
code, making `model.dispose` a functional no-op in production.

Unified into a single Map<sessionId, WebBrowserSessionEntry> with both
chat-cache fields (messageCount, fingerprints) and lifecycle fields
(modelKey, lastUsedAt, idleTimer). `ChromeChatSessionState` now requires
`modelKey`. `disposeWebBrowserSessionsForModel(modelKey)` iterates the
unified store, so model.dispose destroys chat-cached sessions. Chat
sessions become subject to idle eviction (free bonus).
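
The unified store described above might be sketched as follows. The `SessionEntry` field set and the `disposeSessionsForModel` name are simplified stand-ins for `WebBrowserSessionEntry` and `disposeWebBrowserSessionsForModel`, whose real definitions are not shown here.

```typescript
// Hypothetical entry combining chat-cache and lifecycle fields.
interface SessionEntry {
  session: { destroy(): void };
  messageCount: number; // chat-cache high-water mark
  modelKey: string; // lifecycle: which model owns this session
  lastUsedAt: number; // lifecycle: input to idle eviction
}

// One map for every session, so disposal sees chat-cached entries too.
const sessions = new Map<string, SessionEntry>();

// Destroy and remove every session that belongs to the given model.
function disposeSessionsForModel(modelKey: string): number {
  let disposed = 0;
  sessions.forEach((entry, sessionId) => {
    if (entry.modelKey !== modelKey) return;
    entry.session.destroy();
    sessions.delete(sessionId); // deleting during Map.forEach is safe
    disposed++;
  });
  return disposed;
}
```

With a single map there is no second registry that runtime code can forget to populate, which is the failure mode the commit message describes.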

IMPORTANT — sanitizeToolArgs applied across the codebase per intent of
the prior refactor:

  - OpenAIShapedChat (parseOpenAIToolCallMessage + accumulateOpenAIStream)
    → covers OpenAI + HFI
  - ToolCallParsers (adaptParserResult + parseToolCallsFromText)
    → covers llama.cpp Hermes/Liquid/Qwen35/Llama paths + HFT
  - Anthropic_ToolCalling (input_json_delta + content_block_stop)
  - Gemini_ToolCalling (functionCall.args)
  - Ollama_ToolCalling (parsed function.arguments)
  - LlamaCpp_ToolCalling (extractNativeFunctionCalls)
  - Cactus_ToolCalling[.browser] (JSON-parse parseToolCalls paths)

Every model-supplied tool-arg payload now passes through
sanitizeToolArgs before reaching downstream consumers, closing the
prototype-pollution vector across the provider matrix.

Also:
  - Added packages/test/src/test/ai/ToolCallingUtils.test.ts (14 unit
    tests for sanitizeToolArgs, compileToolValidators,
    validateToolCallArgs, plus a sanitize→validate→name-check
    integration test).
  - Added WebBrowser_Sessions.test regression for the unified-store
    behavior (disposeWebBrowserSessionsForModel sees chat-cached
    entries).
  - Documented WebBrowser_Chat's rebuild-on-next-turn recovery model
    (vs the in-fn retry that main's now-deleted test exercised).
sroussey added a commit that referenced this pull request May 22, 2026
* feat(chrome-ai): chat history, tool calling, structured generation, capability probe

* refactor(ai,chrome-ai,openai,hfi): shared tool sanitation; emit-pattern streams

Tool calling utilities (packages/ai/src/task/ToolCallingUtils.ts):
- sanitizeToolArgs: recursive __proto__/constructor/prototype scrubbing
  for model-supplied tool args (prototype-pollution defence)
- compileToolValidators + validateToolCallArgs: per-tool inputSchema
  validation with graceful fallback for tools whose schema fails to compile
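
A recursive scrub of the kind `sanitizeToolArgs` is described as performing could look like the sketch below. This is not the code in ToolCallingUtils.ts; `scrubToolArgs` is a hypothetical name, and the sketch assumes the arguments are plain JSON data.

```typescript
// Keys that can be abused for prototype pollution when merged or assigned.
const FORBIDDEN_KEYS = ["__proto__", "constructor", "prototype"];

// Return a deep copy of model-supplied args with forbidden keys removed.
function scrubToolArgs(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => scrubToolArgs(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const clean: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      if (FORBIDDEN_KEYS.indexOf(key) !== -1) continue; // drop the key entirely
      clean[key] = scrubToolArgs(source[key]);
    }
    return clean;
  }
  return value; // primitives pass through unchanged
}
```

`JSON.parse` creates `__proto__` as an ordinary own property, so `Object.keys` sees it and the scrub can skip it before any assignment could reinterpret it as a prototype change.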

Stream helpers converted from generators to emit-callback so run-fns no
longer need a for-await/yield pump:
- snapshotStreamToTextDeltas / snapshotStreamToSnapshots (chrome-ai)
- accumulateOpenAIStream (@workglow/ai provider-utils, used by OpenAI + HFI)

Run-fns updated to call helpers with emit directly and emit their own
final 'finish' event. chrome-ai's WebBrowser_ToolCalling drops its
private sanitization + validation copy and reuses the shared utils.

* fix(chrome-ai): wire model.dispose; apply sanitizeToolArgs across providers

* feat(chrome-ai): retry once on InvalidStateError when a cached session is destroyed

Chrome can destroy a `LanguageModel` session out from under us (tab
backgrounding, GPU process restart, memory pressure). When a cached
session's `promptStreaming` throws DOMException("...destroyed...",
"InvalidStateError") we now rebuild the session from full history via
`initialPrompts` and retry the prompt once.

Retry is gated on three conditions, all required:
  - We were using a CACHED session (a fresh-session failure means the
    model is broken; retrying won't help).
  - No text-delta has reached the consumer yet (we can't unsend deltas).
  - The error name is `InvalidStateError` (matches Chrome's
    InvalidStateError DOMException; tolerant of message-text changes).

Tests:
  - "retries once with a fresh session when a cached session is destroyed"
    seeds the cache on turn 1, has the cached session's promptStreaming
    throw on turn 2's reuse, asserts rebuild + retry + cache replacement.
  - "does not retry when a fresh (non-cached) session fails" guards the
    first gate.
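
The three retry gates can be expressed as a single predicate plus a small wrapper. The sketch below is an assumption-laden illustration: `StreamFn`, `AttemptContext`, `shouldRetryWithFreshSession`, and `runWithRetry` are invented names, and a session is modelled as a function that streams deltas through a callback.

```typescript
// A session modelled as a function that streams text deltas via a callback.
type StreamFn = (emit: (delta: string) => void) => Promise<void>;

interface AttemptContext {
  usedCachedSession: boolean;
  deltasEmitted: number;
  error: unknown;
}

// All three gates must hold for a retry.
function shouldRetryWithFreshSession(ctx: AttemptContext): boolean {
  const error = ctx.error;
  const name =
    typeof error === "object" && error !== null
      ? (error as { name?: unknown }).name
      : undefined;
  return ctx.usedCachedSession && ctx.deltasEmitted === 0 && name === "InvalidStateError";
}

async function runWithRetry(
  cached: StreamFn | undefined,
  rebuild: () => StreamFn,
  emit: (delta: string) => void,
): Promise<void> {
  let deltasEmitted = 0;
  const counting = (delta: string): void => {
    deltasEmitted++;
    emit(delta);
  };
  const first = cached ?? rebuild();
  try {
    await first(counting);
  } catch (error) {
    const ctx = { usedCachedSession: cached !== undefined, deltasEmitted, error };
    if (!shouldRetryWithFreshSession(ctx)) throw error;
    await rebuild()(counting); // single retry; a second failure propagates
  }
}
```

Matching on the error name rather than the message keeps the check tolerant of wording changes, as the commit message notes.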